[Storage Index Adapter] Fix index template creation on Serverless #264760
viduni94 merged 8 commits into elastic:main
# Disable the embedded Dev Console
console.ui.embeddedEnabled: false

xpack.evals.enabled: true
Added this to test the changes against the serverless test deployment from this PR as the evals golden cluster is broken.
I will be removing this line before merging this PR.
Thanks @viduni94, this is a very good catch. I'm very surprised our existing tests didn't capture it though... what's going on here? Do you happen to know, @rudolf? In terms of the fix, we should proactively check whether we are running in serverless mode and decide based on that instead of waiting for the error and reacting to it.
Thanks @flash1293
Run Metadata
Findings: No findings, all journeys completed without issues.
Screenshots: Screenshots are available in the workflow artifacts.
Workflow run: https://github.com/elastic/kibana-exploratory-testing/actions/runs/24749151549
Thanks @viduni94, definitely better. We'll need to wait for the core team, of course, on how to proceed here. I'm still puzzled that it worked so far... we have lots of tests running on serverless that should catch this exact kind of thing, right?
/ci |
@flash1293 I had to update the implementation slightly to pass the build flavour from the evals plugin as |
💛 Build succeeded, but was flaky
cc @viduni94
TinaHeiligers left a comment
config/serverless.oblt.yml still has xpack.evals.enabled: true with a TODO comment to remove before merging. That needs to come out.
The three-tier serverless detection (explicit flag, proactive esClient.info() check, reactive catch-and-retry) is well thought out and the tests cover all three paths plus the caching behavior. The rest looks good to me.
Thanks for the review @TinaHeiligers
Thank you 🙏🏻
SrdjanLL left a comment
LGTM - code review only!
I think in the long term we should have evals indices registered as system indices in ES (not a blocker at this early stage though).
I've created an issue to track this #264945 - feel free to add more context as needed.
…astic#264760)

Closes elastic#264845

## Summary

Fixes index template creation on Serverless for the indices `kibana-evaluation-datasets` and `kibana-evaluation-dataset-examples`.

PR elastic#263096 added `auto_expand_replicas` and `number_of_shards` to index templates in `StorageIndexAdapter`. Serverless ES rejects these settings on non-system indices with an `illegal_argument_exception`, while hidden indices (e.g. those used by Streams) are unaffected because Kibana manages them as system indices.

### Dataset upsert error for Kibana evaluation runs

<img width="1247" height="473" alt="image" src="https://github.com/user-attachments/assets/10e75668-7a1d-462e-9594-37fbee0f08e3" />

### Error in logs:

```
Failed to upsert evaluation dataset: ResponseError: illegal_argument_exception
Root causes:
illegal_argument_exception: Settings [index.auto_expand_replicas,index.number_of_shards] are not available when running in serverless mode
```

## Fix

The fix detects serverless environments in three tiers when deciding which index template settings to apply:

- Explicit detection - Introduced a new `isServerless` option in `StorageIndexAdapterOptions`. When provided, the adapter skips or includes settings without any extra calls.
- Proactive - if `isServerless` is not provided, the adapter calls `esClient.info()` on the first write and checks `version.build_flavor`. The result is cached for the adapter's lifetime.
- Reactive - if neither of the above is available (e.g. `info()` fails due to insufficient privileges), the adapter catches the `illegal_argument_exception` on the first write, retries without settings, and caches the result.

The Evals plugin passes `isServerless` explicitly because the evals route handler creates `StorageIndexAdapter` with `esClient.asCurrentUser`, which is scoped to the caller's API key. This API key may lack the `monitor` cluster privilege needed for `esClient.info()`, making tier 2 unreliable. There, `buildFlavor` is passed from the plugin context.
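The three tiers can be illustrated with a minimal sketch. This is not the code from this PR: `TemplateWriter`, `MinimalEsClient`, `putIndexTemplate`, `isServerlessSettingsError`, and the placeholder setting values are hypothetical names chosen for illustration, and the sketch assumes that `version.build_flavor` reads `serverless` on a serverless project and that a successful write with settings can be cached as "not serverless".

```typescript
// Hypothetical, simplified shapes; the real StorageIndexAdapter types are not shown here.
interface InfoResponse {
  version: { build_flavor: string };
}

interface MinimalEsClient {
  info(): Promise<InfoResponse>;
  putIndexTemplate(settings: Record<string, unknown>): Promise<void>;
}

// Placeholder values; the PR names the settings but not their values.
const CLASSIC_SETTINGS: Record<string, unknown> = {
  auto_expand_replicas: '0-1',
  number_of_shards: 1,
};

// Assumption: the rejection can be recognised from the error message.
function isServerlessSettingsError(error: unknown): boolean {
  return error instanceof Error && error.message.indexOf('illegal_argument_exception') !== -1;
}

class TemplateWriter {
  // undefined means "not yet known"; once set, the value is reused for the writer's lifetime.
  private serverless: boolean | undefined;

  constructor(private readonly esClient: MinimalEsClient, isServerless?: boolean) {
    // Tier 1: an explicit flag needs no extra calls.
    this.serverless = isServerless;
  }

  private async detect(): Promise<boolean | undefined> {
    if (this.serverless !== undefined) {
      return this.serverless;
    }
    try {
      // Tier 2: proactive check of the build flavor.
      const info = await this.esClient.info();
      this.serverless = info.version.build_flavor === 'serverless';
    } catch {
      // info() may fail, e.g. due to insufficient privileges; leave the state unknown.
    }
    return this.serverless;
  }

  async writeTemplate(): Promise<void> {
    const serverless = await this.detect();
    if (serverless === true) {
      await this.esClient.putIndexTemplate({});
      return;
    }
    try {
      await this.esClient.putIndexTemplate(CLASSIC_SETTINGS);
      // Assumption of this sketch: a successful write means "not serverless".
      this.serverless = false;
    } catch (error) {
      // Tier 3: reactive fallback, only when detection was inconclusive.
      if (serverless === undefined && isServerlessSettingsError(error)) {
        this.serverless = true;
        await this.esClient.putIndexTemplate({});
        return;
      }
      throw error;
    }
  }
}
```

A caller that already knows its environment, as the Evals plugin does, would construct the writer with the flag (for example `new TemplateWriter(client, true)`), so neither `info()` nor the retry path is exercised.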
## Test Plan

- [x] Deploy the fix to a serverless project from this PR
- [x] Create a config file (e.g. `config.testcluster.json`) and add the serverless project URL as the dataset target
- [x] Run evals with `node scripts/evals start --suite significant-events --project eis-anthropic-claude-4-6-sonnet --judge eis-google-gemini-3-1-pro --export-profile local --datasets-profile testcluster`

### With this fix, the dataset upsert works as expected

<img width="1531" height="877" alt="image" src="https://github.com/user-attachments/assets/84c2a5cd-138b-457e-85d3-bd87bff4867c" />
<img width="1710" height="556" alt="image" src="https://github.com/user-attachments/assets/bbfeb03a-405f-4551-8326-e12b0192d332" />

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
- [x] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.